SCALING LIMITS OF RANDOM POLYA TREES


KONSTANTINOS PANAGIOTOU AND BENEDIKT STUFLER 


Abstract. Polya trees are rooted trees considered up to symmetry. We establish the
convergence of large uniform random Polya trees with arbitrary degree restrictions to Aldous'
Continuum Random Tree with respect to the Gromov-Hausdorff metric. Our proof is short 
and elementary, and it shows that the global shape of a random Polya tree is essentially 
dictated by a large Galton-Watson tree that it contains. We also derive sub-Gaussian tail 
bounds for both the height and the width, which are optimal up to constant factors in the 
exponent. 


1. Introduction and main results 

Any connected graph G with vertex set V can be associated in a natural way with a metric
space $(V, d_G)$, where $d_G(u, v)$ is defined as the length of a shortest path that contains u and
v in G. In this paper we consider the setting where G is a random tree with n vertices, and
we study, as $n \to \infty$, several properties of the associated random metric space.

The most prominent and well-studied case that fits in our setting is when G is a critical 
Galton-Watson random tree with n vertices, where the offspring distribution has a finite 
non-zero variance. In the series U®M of seminal papers Aldous proved that the metric 
spaces associated to those trees admit a common and universal limit, the so-called Continuum 
Random Tree (CRT). Since then, the CRT has been shown to be the limit of various families 
of random combinatorial structures, in particular other distributions on trees, see e.g. Haas 
and Miermont [18] and references therein, planar maps, see e.g. Albenque and Marckert [2], 
Bettinelli [7], Caraceni ms, Curien, Haas and Kortchemski m, Janson and Stefansson [20],
Stufler m, and certain families of graphs, see Panagiotou, Stufler, and Weller [28].

Here we study the class of Polya trees, which are rooted trees (that is, there is a
distinguished vertex called the root) considered up to symmetry, equipped with the uniform
distribution. They are named after George Polya, who developed a framework based on
generating functions in order to study their properties [29]. The study of these objects, especially
in random settings, poses significant difficulties: the presence of non-trivial symmetries makes 
it difficult to derive an explicit and handy description of the probability space at hand. To 
wit, random Polya trees do not fit into well-studied models of random trees such as Galton- 
Watson trees, a fact that was widely believed and which was established rigorously by Drmota 
and Gittenberger m- 

The main contribution of this paper is a simple and short proof that establishes the scaling 
limit of random Polya trees, with the additional benefit that it allows us to consider arbitrary 
degree restrictions. That is, we may restrict the outdegrees of the vertices to an arbitrary set 
(always including, of course, 0 and an integer $\ge 2$, so that the trees are finite and non-trivial).
Our proof also reveals a novel striking structural property that is of independent interest and 
that rectifies the common perception of random unlabelled trees. As already mentioned, it is 
known that random Polya trees do not admit a simple probabilistic description. However, we
argue that this barely fails to be the case, namely that a random Polya tree “consists” in a 
well-defined sense of a large Galton-Watson tree having a random size to which small forests 
are attached - in other words, the global structure of a large random Polya tree is similar to 
the structure of the Galton-Watson tree it contains. 


Theorem 1.1. Let $\Omega$ be an arbitrary set of nonnegative integers containing zero and at
least one integer greater than or equal to two. Let $\mathsf{A}_n$ denote the uniform random Polya tree
with n vertices and vertex outdegrees in $\Omega$. Then there exists a constant $c_\Omega > 0$ such that the
metric space $(\mathsf{A}_n, c_\Omega n^{-1/2} d_{\mathsf{A}_n})$ converges towards the continuum random tree
$(\mathcal{T}_e, d_{\mathcal{T}_e})$ in the Gromov-Hausdorff sense as $n \equiv 1 \bmod \gcd(\Omega)$ tends to infinity.


In the theorem we use the normalization of Le Gall [23] and let $\mathcal{T}_e$ denote the continuum
random tree constructed from Brownian excursion, see Section 2 for the appropriate
definitions. We also obtain explicit expressions for the scaling constant $c_\Omega$ in our proof, see (5.8)
and the subsequent equations.

Random Polya trees were studied in several papers prior to this work. In particular, since 
the construction of the CRT in the early 90's it was a long-standing conjecture [3, p. 55] that
this model of random trees (without any degree restrictions) also allows the same scaling limit. 
The convergence of binary Polya trees, that is, when the vertex outdegrees are restricted to 
the set {0, 2}, was established by Marckert and Miermont [26] using an appropriate trimming 
procedure. Later, in [18] the conjecture was proven by using different techniques; actually, 
a far more general result on the scaling limit of random trees satisfying a certain Markov 
branching property was shown. Among other results, the method in [18] also allows one to study
Polya trees with some degree restrictions, where the vertex outdegrees are constrained
to a set of the form $\{0, 1, \ldots, d\}$ or $\{0, d\}$ for $d \ge 2$. However, the question about the
convergence of Polya trees with arbitrary degree restrictions was open, and we answer it in 
this work with a simple argument. 

Our next result is concerned with two extremal parameters. The height H(T) of a rooted 
tree T is defined as the maximal distance of a vertex from the root, and the width W(T) is 
the maximal number of vertices at any fixed distance from the root. 


Theorem 1.2. Let $\Omega$ be an arbitrary set of nonnegative integers containing zero and at least
one integer greater than or equal to two. Let $\mathsf{A}_n$ denote the uniform random Polya tree with n
vertices and vertex outdegrees in $\Omega$. Then there are constants $C, c > 0$ such that

$$\mathbb{P}(H(\mathsf{A}_n) \ge x) \le C \exp(-c x^2 / n), \qquad \mathbb{P}(W(\mathsf{A}_n) \ge x) \le C \exp(-c x^2 / n)$$

for all $x \ge 0$ and $n \equiv 1 \bmod \gcd(\Omega)$.

Similar bounds were obtained by Addario-Berry, Devroye and Janson [1] for critical Galton- 
Watson trees with finite nonzero variance, conditioned to be large. Our proofs show that these 
bounds are (up to the choice of c, C) best possible. As a direct consequence of our results we 
obtain for the distribution of H that 

$$c_\Omega n^{-1/2} H(\mathsf{A}_n) \xrightarrow{(d)} H(\mathcal{T}_e) \qquad \text{and} \qquad \mathbb{E}[H(\mathsf{A}_n)^p] \sim c_\Omega^{-p} n^{p/2} \, \mathbb{E}[H(\mathcal{T}_e)^p]$$

for all $p \ge 1$. The distribution of the height $H(\mathcal{T}_e)$ is known and given by

$$H(\mathcal{T}_e) \overset{(d)}{=} \sup_{0 \le t \le 1} e(t), \tag{1.1}$$

where e(t) is a Brownian excursion of duration one, and

$$\mathbb{P}(H(\mathcal{T}_e) > x) = 2 \sum_{k \ge 1} (4 k^2 x^2 - 1) \exp(-2 k^2 x^2), \tag{1.2}$$

see [H Ch. 3.1]. Its moments are also known and given by

$$\mathbb{E}[H(\mathcal{T}_e)] = \sqrt{\pi/2}, \qquad \mathbb{E}[H(\mathcal{T}_e)^p] = 2^{-p/2} p (p-1) \Gamma(p/2) \zeta(p) \quad \text{for } p \ge 2.$$

This follows from standard results for Brownian excursion by Chung mi, or by results of 
Renyi and Szekeres m Eq. (4.5)]) who calculated the moments of the limit distribution of 
the height of a class of trees that converges towards the CRT; see also |28j. 
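
The tail formula (1.2) and the moment formulas can be cross-checked numerically. The following is a minimal sketch, not part of the original argument: it integrates the tail $\mathbb{P}(H(\mathcal{T}_e) > x)$ with the trapezoidal rule and compares the result with the closed forms. The function names, the cut-off at x = 0.2 (below which the tail equals 1 up to a negligible error, while the series is numerically delicate) and the truncation parameters are assumptions made for illustration.

```python
import math


def height_tail(x, terms=40):
    """P(H(T_e) > x) from (1.2), truncated after `terms` summands.

    For x < 0.2 the probability is 1 up to an astronomically small error,
    and the alternating-size series is awkward there, so we return 1 directly.
    """
    if x < 0.2:
        return 1.0
    s = sum((4 * k * k * x * x - 1) * math.exp(-2 * k * k * x * x)
            for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2 * s))


def moment_from_tail(p, upper=8.0, steps=4000):
    """E[H^p] = int_0^inf p x^(p-1) P(H > x) dx via the trapezoidal rule."""
    h = upper / steps
    total = 0.0
    for j in range(steps + 1):
        x = j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * p * x ** (p - 1) * height_tail(x)
    return total * h


def moment_closed_form(p, zeta_terms=100000):
    """2^(-p/2) p (p-1) Gamma(p/2) zeta(p) for p >= 2, zeta summed directly."""
    zeta = sum(k ** (-p) for k in range(1, zeta_terms + 1))
    return 2 ** (-p / 2) * p * (p - 1) * math.gamma(p / 2) * zeta


if __name__ == "__main__":
    print(moment_from_tail(1), math.sqrt(math.pi / 2))
    print(moment_from_tail(2), moment_closed_form(2))
```

Integrating the series in (1.2) term by term over $[0, \infty)$ would give zero for every summand, so the exchange of sum and integral is not allowed here; this is why the sketch integrates the summed tail instead.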

Methods. Our method relies on generating random Polya trees using the framework of 
Boltzmann samplers mm- Our main insight is that this allows us to show that with high 
probability, that is, with probability tending to one as $n \to \infty$, the shape of the Polya tree $\mathsf{A}_n$
is given by a subtree $\mathcal{T}_n$ with small subtrees that contain $O(\log n)$ vertices attached to each
vertex. As a metric space, we argue that $\mathcal{T}_n$ is distributed like a critical Galton-Watson
tree, whose offspring distribution even has finite exponential moments, conditioned on having
a randomly drawn size concentrating around n times a constant. In particular, by using
Aldous's fundamental result [5], see also [23], [25], we obtain that $\mathcal{T}_n$ converges to a multiple
of the CRT; moreover, the Gromov-Hausdorff distance of $(\mathsf{A}_n, n^{-1/2} d_{\mathsf{A}_n})$ and
$(\mathcal{T}_n, n^{-1/2} d_{\mathcal{T}_n})$ converges in probability to zero, yielding the desired result. To prove
Theorem 1.2 we then use tail-bounds for $H(\mathcal{T}_n)$ in order to obtain the corresponding bounds for
the height of $\mathsf{A}_n$.

Outline. The paper is structured as follows. In the next section we recall Aldous’ theorem 
regarding the convergence of Galton-Watson trees, and we introduce some notation that will 
be used throughout the paper. Since this paper is targeted to a probabilistic audience, we 
introduce in Section 3 all required combinatorial preliminaries, tailored to our specific aims.
In Section 4 we give a formal definition of Polya trees and derive the sampling algorithm that
will be the basis of our analysis. Finally, in Section 5, which is the main novel contribution
of this paper, we present the proofs of our main theorems. 

2. Aldous’ fundamental theorem 

2.1. Gromov-Hausdorff convergence. The exposition here is based on Pi Ch. 7] and [25]. 
A pointed metric space is a metric space together with a distinguished element that is often 
referred to as its root. A correspondence between two pointed metric spaces $X^\bullet = (X, d_X, x_0)$
and $Y^\bullet = (Y, d_Y, y_0)$ is a subset $R \subseteq X \times Y$ such that $(x_0, y_0) \in R$, and for each $x \in X$ there
is a point $y \in Y$ with $(x, y) \in R$, and for each $y \in Y$ there is a point $x \in X$ with $(x, y) \in R$.
The distortion of the correspondence R is given by

$$\mathrm{dis}(R) = \sup_{(x_1, y_1), (x_2, y_2) \in R} |d_X(x_1, x_2) - d_Y(y_1, y_2)|.$$

If $(X, d_X)$ and $(Y, d_Y)$ are compact, the Gromov-Hausdorff distance between $X^\bullet$ and $Y^\bullet$ is
defined by

$$d_{GH}(X^\bullet, Y^\bullet) = \frac{1}{2} \inf_R \mathrm{dis}(R) \in [0, \infty[,$$

with the index R ranging over all correspondences between $X^\bullet$ and $Y^\bullet$, see P Thm. 7.3.25]
and [25, Prop. 3.6].
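
For very small finite pointed metric spaces the definition can be evaluated by brute force. The sketch below is an illustration only, under the assumption that a space is given as a distance matrix with points labelled 0, 1, ...; the helper names are hypothetical. It enumerates all subsets of $X \times Y$, keeps those that are correspondences containing the pair of roots, and returns half of the smallest distortion. The running time is exponential in $|X| \cdot |Y|$, so this is only meant to make the definition concrete.

```python
from itertools import product


def distortion(R, dX, dY):
    """dis(R): largest discrepancy between distances of corresponding pairs."""
    return max(abs(dX[x1][x2] - dY[y1][y2])
               for (x1, y1) in R for (x2, y2) in R)


def is_correspondence(R, nX, nY, x0, y0):
    """R must contain the pair of roots and project onto both spaces."""
    return ((x0, y0) in R
            and {x for x, _ in R} == set(range(nX))
            and {y for _, y in R} == set(range(nY)))


def gh_distance(dX, dY, x0=0, y0=0):
    """Pointed Gromov-Hausdorff distance of two finite metric spaces."""
    nX, nY = len(dX), len(dY)
    pairs = list(product(range(nX), range(nY)))
    best = float("inf")
    for mask in range(1, 1 << len(pairs)):
        R = {p for i, p in enumerate(pairs) if (mask >> i) & 1}
        if is_correspondence(R, nX, nY, x0, y0):
            best = min(best, distortion(R, dX, dY))
    return best / 2


# Graph metrics of a single vertex, an edge, and a path with three vertices,
# each rooted at vertex 0.
point = [[0]]
edge = [[0, 1], [1, 0]]
path3 = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

if __name__ == "__main__":
    print(gh_distance(edge, point), gh_distance(path3, edge))
```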
Two pointed metric spaces are isometric if there exists a distance preserving bijection
between the two that also preserves the roots. The Gromov-Hausdorff distance does not
change if one of the spaces is replaced by another isometric copy. Moreover, two pointed spaces
have Gromov-Hausdorff distance zero if and only if they are isometric, and the
Gromov-Hausdorff distance satisfies the axioms of a premetric on the collection of compact pointed
metric spaces, see |SJ Thm. 7.3.30] and [25, Thm. 3.5]. We may thus view $d_{GH}$ as a metric on
the collection $\mathbb{K}^\bullet$ of all isometry classes of compact pointed metric spaces.

2.2. The continuum random tree. The continuum random tree (CRT) $\mathcal{T}_e$ is a random
metric space that is encoded by the Brownian excursion of duration one. We briefly introduce
it following [25], [22]. Given an arbitrary continuous function $f : [0,1] \to [0, \infty[$ satisfying
$f(0) = f(1) = 0$ we may define a premetric d on the interval [0,1] given by

$$d(u, v) = f(u) + f(v) - 2 \inf_{u \le s \le v} f(s)$$

for $u \le v$. Let $(\mathcal{T}_f, d_{\mathcal{T}_f}) = ([0,1]/{\sim}, d)$ denote the corresponding quotient space obtained
by identifying points that have distance zero. We consider this space as being rooted at the
equivalence class $\bar{0}$ of 0. The random pointed metric space $(\mathcal{T}_e, d_{\mathcal{T}_e}, \bar{0})$ coded by the Brownian
excursion of duration one $e = (e_t)_{0 \le t \le 1}$ is called the Brownian continuum random tree (CRT).
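
The same construction can be tried on a discrete excursion, which may help to see why the quotient is a tree. The following sketch is an illustration under stated assumptions: the function f is replaced by a list of values at integer times (the contour function of a small plane tree), and the helper names are hypothetical. Points at distance zero are glued, and the resulting classes are exactly the vertices of the encoded tree.

```python
def tree_distance(f, u, v):
    """d(u, v) = f(u) + f(v) - 2 * min of f over [u, v], for integer times."""
    if u > v:
        u, v = v, u
    return f[u] + f[v] - 2 * min(f[u:v + 1])


def glued_classes(f):
    """Group the time indices that are at distance zero from each other."""
    classes = []
    for t in range(len(f)):
        for cls in classes:
            if tree_distance(f, cls[0], t) == 0:
                cls.append(t)
                break
        else:
            classes.append([t])
    return classes


# Contour function of the tree "root - a", where a has two children b and c.
contour = [0, 1, 2, 1, 2, 1, 0]

if __name__ == "__main__":
    print(glued_classes(contour))
    print(tree_distance(contour, 2, 4))
```

In this example the times 0 and 6 are glued to the root, the times 1, 3 and 5 to the vertex a, and the two local maxima give the leaves b and c, which are at distance 2 from each other.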

2.3. Plane trees and Aldous' theorem. The Ulam-Harris tree is defined as an infinite
rooted tree with vertex set $\bigcup_{n \ge 0} \mathbb{N}^n$ consisting of finite sequences of natural numbers. The
empty string $\emptyset$ is its root, and the offspring of any vertex v is given by the concatenations
$\{vi : i \in \mathbb{N}\}$. In particular, the labelling of the vertices induces a linear order on each offspring
set. A plane tree is defined as a subtree of the Ulam-Harris tree that contains the root. Any
plane tree is a pointed metric space with respect to the graph-metric and the root vertex $\emptyset$.
Hence random plane trees may be considered as random elements of the metric space $\mathbb{K}^\bullet$.

Let $\xi$ be a random variable with support on $\mathbb{N}_0$. Then, a $\xi$-Galton-Watson tree $\mathcal{T}$ is the
family tree of a Galton-Watson branching process with offspring distribution $\xi$, interpreted
as a (possibly infinite) plane tree. We call $\mathcal{T}$ critical if $\mathbb{E}[\xi] = 1$. The following invariance
principle giving a scaling limit for certain random plane trees is due to Aldous [5] and there 
exist various extensions, see for example mmm- 

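Theorem 2.1. Let $\mathcal{T}_n$ be a critical $\xi$-Galton-Watson tree conditioned on having n vertices,
with the offspring distribution $\xi$ having finite non-zero variance $\sigma^2$. As n tends to infinity,
$\mathcal{T}_n$ with edges rescaled to length $\sigma / (2\sqrt{n})$ converges in distribution to the CRT, that is

$$\Big(\mathcal{T}_n, \frac{\sigma}{2\sqrt{n}} d_{\mathcal{T}_n}, \emptyset\Big) \xrightarrow{(d)} (\mathcal{T}_e, d_{\mathcal{T}_e}, \bar{0})$$

in the metric space $(\mathbb{K}^\bullet, d_{GH})$.

In the following we use a more compact notation, writing $\alpha \mathcal{T}_n$ and $\mathcal{T}_e$ when referring to
$(\mathcal{T}_n, \alpha d_{\mathcal{T}_n}, \emptyset)$ and $(\mathcal{T}_e, d_{\mathcal{T}_e}, \bar{0})$.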
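
To experiment with Theorem 2.1 one needs samples of conditioned Galton-Watson trees. The sketch below is one standard way to obtain them and is not taken from the paper: it assumes the geometric offspring law $\mathbb{P}(\xi = k) = 2^{-(k+1)}$ (mean 1, variance $\sigma^2 = 2$), draws n offspring numbers by rejection until they sum to n - 1, and uses the cycle lemma to rotate them into the depth-first outdegree sequence of a tree. All function names are hypothetical.

```python
import math
import random


def conditioned_gw_degrees(n, rng):
    """Depth-first outdegree sequence of a critical geometric Galton-Watson
    tree conditioned on having n vertices (rejection plus cycle lemma)."""
    while True:
        degs = []
        for _ in range(n):
            k = 0
            while rng.random() < 0.5:   # P(k) = 2^-(k+1)
                k += 1
            degs.append(k)
        if sum(degs) == n - 1:
            break
    # Rotate so that the walk with steps d - 1 first hits -1 at time n:
    # start right after the first time the walk attains its minimum.
    s, lowest, argmin = 0, 0, -1
    for i, d in enumerate(degs):
        s += d - 1
        if s < lowest:
            lowest, argmin = s, i
    start = argmin + 1
    return degs[start:] + degs[:start]


def is_tree_code(degs):
    """Check that degs is the depth-first outdegree sequence of a plane tree."""
    s = 0
    for i, d in enumerate(degs):
        s += d - 1
        if s < 0 and i < len(degs) - 1:
            return False
    return s == -1


def height(degs):
    """Height of the plane tree with depth-first outdegree sequence degs."""
    stack = []   # unfinished-children counters of the ancestors on the current path
    h = 0
    for d in degs:
        h = max(h, len(stack))          # depth of the vertex being visited
        if d > 0:
            stack.append(d)
        else:
            while stack:                # close finished ancestors
                stack[-1] -= 1
                if stack[-1] > 0:
                    break
                stack.pop()
    return h


if __name__ == "__main__":
    rng = random.Random(2024)
    n, sigma = 400, math.sqrt(2)
    degs = conditioned_gw_degrees(n, rng)
    # Height rescaled as in Theorem 2.1; its law is close to that of H(T_e).
    print(sigma / (2 * math.sqrt(n)) * height(degs))
```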

2.4. Tail-bounds for the height and width. In [1, Thm. 1.2] the following tail-bounds 
were obtained. 

Theorem 2.2. Let $\mathcal{T}_n$ be a critical $\xi$-Galton-Watson tree conditioned on having n vertices,
with the offspring distribution $\xi$ having finite non-zero variance $\sigma^2$. Then there are constants
$C, c > 0$ such that for all $x \ge 0$ and n

$$\mathbb{P}(H(\mathcal{T}_n) \ge x) \le C \exp(-c x^2 / n), \qquad \mathbb{P}(W(\mathcal{T}_n) \ge x) \le C \exp(-c x^2 / n).$$
3. Combinatorial preliminaries

We recall relevant notions and tools from combinatorics. In particular, we discuss
constructions of combinatorial classes following Joyal m and Flajolet and Sedgewick EH, and
give a brief account of Boltzmann samplers following Flajolet, Fusy, and Pivoteau [16]. These
will be our main tools for studying the class of Polya trees.

3.1. Combinatorial classes and generating series. A combinatorial class $\mathcal{C}$ is a set
together with a size-function $|\cdot| : \mathcal{C} \to \mathbb{N}_0$. We require that for any $n \in \mathbb{N}_0$ the subset
$\mathcal{C}_n \subseteq \mathcal{C}$ of all n-sized elements is finite. The ordinary generating series of a class $\mathcal{C}$ is
defined as the formal power series

$$\mathcal{C}(z) = \sum_{n \in \mathbb{N}_0} |\mathcal{C}_n| z^n,$$

with $|\mathcal{C}_n|$ denoting the number of elements of the set $\mathcal{C}_n$. We set $[z^n]\mathcal{C}(z) = |\mathcal{C}_n|$.

3.2. Permutations. As an example of a combinatorial class, consider the class

$$\mathcal{S} = \bigcup_{n \in \mathbb{N}_0} \mathcal{S}_n$$

of all permutations with $\mathcal{S}_n$ denoting the symmetric group of order n. Its ordinary generating
series is given by

$$\mathcal{S}(z) = \sum_{n \in \mathbb{N}_0} n! \, z^n.$$

Recall that any permutation $\sigma$ may be written in an essentially unique way as a product of
disjoint cycles (corresponding to the orbits of the permutation). In the following we are going
to let $\sigma_i$, $i \ge 1$, denote the number of cycles of length i, that is, with exactly i elements, in
this factorization. Here we count fixpoints as 1-cycles.

3.3. Operations on classes. 

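3.3.1. Product classes. Given two classes $\mathcal{C}$ and $\mathcal{D}$ we may form the product class as the
set-theoretic product

$$\mathcal{C} \cdot \mathcal{D} = \mathcal{C} \times \mathcal{D}$$

with the size-function given by $|(C, D)| = |C| + |D|$ for any $C \in \mathcal{C}$ and $D \in \mathcal{D}$. It is a
straightforward consequence, see also Chapter 1.1. in m, that

$$(\mathcal{C} \cdot \mathcal{D})(z) = \mathcal{C}(z) \mathcal{D}(z).$$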

3.3.2. Multisets. We may also form the class $\mathrm{MSET}(\mathcal{C})$ of all multisets of elements of $\mathcal{C}$, that
is, sets of the form

$$\{(C_1, n_1), \ldots, (C_k, n_k)\}$$

with $k \in \mathbb{N}_0$, $C_1, \ldots, C_k \in \mathcal{C}$ being pairwise distinct and $n_1, \ldots, n_k \in \mathbb{N}$. Here the sum
$\sum_{i=1}^{k} n_i$ denotes the number of elements of the multiset. The size-function for multisets is
given by

$$|\{(C_1, n_1), \ldots, (C_k, n_k)\}| = \sum_{i=1}^{k} |C_i| n_i.$$

For any subset $\Omega \subseteq \mathbb{N}_0$ we may also form the class $\mathrm{MSET}_\Omega(\mathcal{C}) \subseteq \mathrm{MSET}(\mathcal{C})$ by restricting to
multisets whose number of elements lies in $\Omega$.

In order to express the ordinary generating series for the class of multisets in $\mathcal{C}$, we require
the concept of cycle index sums. Recall that for any permutation $\sigma$ we let $\sigma_i$ denote the
number of cycles of length i of $\sigma$.


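Definition 3.1. For any subset $\Omega \subseteq \mathbb{N}_0$ define the cycle index sum

$$Z_\Omega(s_1, s_2, \ldots) = \sum_{k \in \Omega} \frac{1}{k!} \sum_{\sigma \in \mathcal{S}_k} s_1^{\sigma_1} \cdots s_k^{\sigma_k}.$$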


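For example, when $\Omega = \mathbb{N}_0$ a short calculation, see below, shows that

$$Z_{\mathbb{N}_0}(s_1, s_2, \ldots) = \exp\Big(\sum_{i \ge 1} s_i / i\Big).$$

Indeed, for any permutation $\sigma$ the sequence $(\sigma_i)_{i \ge 1}$ is an element of the space
$\mathbb{N}_0^{(\mathbb{N})}$ of all sequences in $\mathbb{N}_0$ with finite support. Conversely, to any element
$m = (m_i)_{i \in \mathbb{N}} \in \mathbb{N}_0^{(\mathbb{N})}$ correspond only permutations of order $n = \sum_{i} i \, m_i$, and their
number is given by $n! / \prod_{i \ge 1} (m_i! \, i^{m_i})$. Hence

$$Z_{\mathbb{N}_0} = \sum_{m \in \mathbb{N}_0^{(\mathbb{N})}} \prod_{i \ge 1} \frac{s_i^{m_i}}{m_i! \, i^{m_i}} = \prod_{i \ge 1} \sum_{m_i \ge 0} \frac{s_i^{m_i}}{m_i! \, i^{m_i}} = \prod_{i \ge 1} \exp(s_i / i) = \exp\Big(\sum_{i \ge 1} s_i / i\Big).$$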
i> 1 i> 1 


We may now express the ordinary generating series for a multiset of objects. This result is 
implicit in Harary and Palmer m as an application of Polya's Enumeration Theorem; see
also Joyal [21, Prop. 9] for a formulation in a more general setting.


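Proposition 3.2. The ordinary generating series of a multiset class $\mathrm{MSET}_\Omega(\mathcal{C})$ is given by

$$\mathrm{MSET}_\Omega(\mathcal{C})(z) = Z_\Omega(\mathcal{C}(z), \mathcal{C}(z^2), \mathcal{C}(z^3), \ldots).$$

In particular,

$$\mathrm{MSET}(\mathcal{C})(z) = \exp\Big(\sum_{k \ge 1} \mathcal{C}(z^k) / k\Big).$$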


3.4. Boltzmann samplers. Given a nonempty combinatorial class $\mathcal{C}$ and a parameter $x > 0$
with $\mathcal{C}(x) < \infty$, we may consider the corresponding Boltzmann distribution on $\mathcal{C}$ that
assigns probability weight $x^{|C|} / \mathcal{C}(x)$ to any element $C \in \mathcal{C}$. A Boltzmann sampler $\Gamma\mathcal{C}(x)$ is
a stochastic process that generates elements from $\mathcal{C}$ according to the Boltzmann distribution
with parameter x. There are various rules according to which we may construct such samplers.


3.4.1. Product. Let $\mathcal{C}$ and $\mathcal{D}$ be nonempty combinatorial classes and $x > 0$ such that
$\mathcal{C}(x), \mathcal{D}(x)$ are finite. Then a Boltzmann sampler $\Gamma(\mathcal{C} \cdot \mathcal{D})(x)$ for the product is given as follows.

1. Draw an element $C \in \mathcal{C}$ using a Boltzmann sampler $\Gamma\mathcal{C}(x)$.

2. Draw independently an element $D \in \mathcal{D}$ using a Boltzmann sampler $\Gamma\mathcal{D}(x)$.

3. Return the pair $(C, D)$.

See, for example, Section 2 of [16] for the (simple) justification.

3.4.2. Multiset. Let $\mathcal{C}$ be a nonempty combinatorial class and $\Omega \subseteq \mathbb{N}_0$ a subset. Then for any
parameter $x > 0$ with $\mathrm{MSET}_\Omega(\mathcal{C})(x) < \infty$ a Boltzmann sampler $\Gamma\mathrm{MSET}_\Omega(\mathcal{C})(x)$ is given as
follows.

1. Draw a permutation $\sigma$ from $\bigcup_{k \in \Omega} \mathcal{S}_k$ such that for each $k \in \Omega$ and $\nu \in \mathcal{S}_k$

$$\mathbb{P}(\sigma = \nu) = \frac{1}{k!} \, \mathcal{C}(x)^{\nu_1} \mathcal{C}(x^2)^{\nu_2} \cdots \mathcal{C}(x^k)^{\nu_k} \Big/ \mathrm{MSET}_\Omega(\mathcal{C})(x).$$

2. For each cycle $\tau$ of $\sigma$ let $|\tau|$ denote its length. Draw a random element $C_\tau \in \mathcal{C}$ using a
Boltzmann sampler $\Gamma\mathcal{C}(x^{|\tau|})$.

3. Return the multiset of $\mathcal{C}$-objects that contains any $C \in \mathcal{C}$ precisely $\sum_{\tau : C_\tau = C} |\tau|$ times,
with the index $\tau$ ranging over all cycles of $\sigma$.

For a proof see 0 Prop. 38] and [16, Thm. 4.2].

4. Random Polya trees 

4.1. Combinatorial decomposition of Polya trees. Let $\Omega \subseteq \mathbb{N}_0$ be a subset containing
0 and at least one integer $\ge 2$. Let $\mathcal{A}_\Omega$ denote the combinatorial class of Polya trees with
vertex outdegrees in $\Omega$. Any Polya tree A is uniquely determined by the multiset

$$M(A) = \{(A_1, n_1), \ldots, (A_k, n_k)\}$$

of smaller Polya trees obtained by removing the root vertex of A. The tree A has vertex
outdegrees in $\Omega$ if and only if the number of elements of the multiset lies in $\Omega$, and if each
of its elements $A_i$ belongs to $\mathcal{A}_\Omega$. Thus, letting $\mathcal{X} = \{\circ\}$ denote the combinatorial class
consisting of a single object with size 1, the map

$$\mathcal{A}_\Omega \to \mathcal{X} \cdot \mathrm{MSET}_\Omega(\mathcal{A}_\Omega), \qquad A \mapsto (\circ, M(A)) \tag{4.1}$$

is a size-preserving bijection. Using Proposition 3.2 this yields the equation

$$\mathcal{A}_\Omega(z) = z Z_\Omega(\mathcal{A}_\Omega(z), \mathcal{A}_\Omega(z^2), \ldots). \tag{4.2}$$
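
Equation (4.2) determines the coefficients of $\mathcal{A}_\Omega(z)$ recursively, since the factor z means that the n-th coefficient only depends on smaller ones. The sketch below is an illustration and not part of the paper: it iterates (4.2) on truncated power series with exact rational arithmetic, writing the cycle index of $\mathcal{S}_k$ as a sum over cycle types (integer partitions of k). The function names are hypothetical, and the approach is only practical for small sizes.

```python
from fractions import Fraction
from math import factorial


def partitions(k, largest=None):
    """Yield the integer partitions of k as lists of parts (cycle types of S_k)."""
    largest = k if largest is None else largest
    if k == 0:
        yield []
        return
    for part in range(min(k, largest), 0, -1):
        for rest in partitions(k - part, part):
            yield [part] + rest


def mul(a, b, N):
    """Product of two power series, truncated after degree N."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b[:N + 1 - i]):
                c[i + j] += ai * bj
    return c


def dilate(a, i, N):
    """Coefficients of A(z^i), truncated after degree N."""
    c = [Fraction(0)] * (N + 1)
    for n, an in enumerate(a):
        if n * i <= N:
            c[n * i] = an
    return c


def polya_tree_counts(omega, N):
    """Numbers of Polya trees with 0..N vertices and outdegrees in omega,
    obtained by iterating A(z) = z * Z_omega(A(z), A(z^2), ...)."""
    A = [Fraction(0)] * (N + 1)
    for _ in range(N):                      # after t rounds, degrees <= t are exact
        new = [Fraction(0)] * (N + 1)
        for k in omega:
            if k > N - 1:
                continue                    # too many children for N vertices
            for lam in partitions(k):
                term = [Fraction(1)] + [Fraction(0)] * N
                weight = Fraction(1)
                for i in set(lam):
                    m = lam.count(i)        # m cycles of length i
                    weight /= factorial(m) * i ** m
                    Ai = dilate(A, i, N)
                    for _rep in range(m):
                        term = mul(term, Ai, N)
                for n in range(N):
                    new[n + 1] += weight * term[n]   # multiplication by z
        A = new
    return [int(c) for c in A]


if __name__ == "__main__":
    print(polya_tree_counts(range(0, 8), 8))   # all Polya trees
    print(polya_tree_counts({0, 2}, 11))       # binary Polya trees
```

Without degree restrictions this reproduces the classical numbers of unlabelled rooted trees, and for $\Omega = \{0, 2\}$ the nonzero coefficients are the Wedderburn-Etherington numbers, indexed by the number of vertices.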

4.2. Enumerative properties. In this section we collect basic analytic facts regarding Polya 
trees, which are frequently used in the proofs of the main theorems. The following result is 
obtained by applying a general enumeration theorem due to Bell, Burris and Yeats [6, Thm.
75]. Special cases such as for trees with less general vertex-degree restrictions are classical
combinatorial results, see e.g. m Thm. VII.4] but also Polya [29] and Otter [27]. We do
provide an explicit proof for the reader's convenience, but do not claim novelty of this result.
Although it does not seem to be explicitly stated in this generality in the literature, it is 
implicit in the work [6] and the present proof summarizes the corresponding arguments. 

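Proposition 4.1. Let $\rho_\Omega$ denote the radius of convergence of the ordinary generating
function $\mathcal{A}_\Omega(z)$. Then the following holds.

i) We have that $0 < \rho_\Omega < 1$ and $0 < \mathcal{A}_\Omega(\rho_\Omega) < \infty$.

ii) For some $\epsilon > 0$, the function $E(z, w) = z Z_\Omega(w, \mathcal{A}_\Omega(z^2), \mathcal{A}_\Omega(z^3), \ldots)$ satisfies

$$E(\rho_\Omega + \epsilon, \mathcal{A}_\Omega(\rho_\Omega) + \epsilon) < \infty.$$

iii) For some constant $d_\Omega > 0$, the number of Polya trees with n vertices and outdegrees in $\Omega$
is given by

$$[z^n]\mathcal{A}_\Omega(z) \sim d_\Omega n^{-3/2} \rho_\Omega^{-n}.$$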

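Proof. We start with the proof of i). The series $\mathcal{A}_\Omega(z)$ is dominated coefficientwise by the
ordinary generating series $\mathcal{A}(z) = \mathcal{A}_{\mathbb{N}_0}(z)$ of all Polya trees and it is known that $\mathcal{A}(z)$ is
analytic at the origin (see e.g. DU Prop. VII.5] and [29], [27]). Hence $\rho_\Omega > 0$. As formal power
series we have by (4.2) that $\mathcal{A}_\Omega(z) = z Z_\Omega(\mathcal{A}_\Omega(z), \mathcal{A}_\Omega(z^2), \ldots)$. The coefficients of all involved
series are nonnegative, hence we may lift this identity of formal power series to an identity of
real numbers. By assumption, $0 \in \Omega$ and there is an integer $\ell \ge 2$ such that $\ell \in \Omega$. Thus, for
all $0 < x < \rho_\Omega$

$$\mathcal{A}_\Omega(x) \ge x \Big(1 + \frac{1}{\ell!} \sum_{\sigma \in \mathcal{S}_\ell} \mathcal{A}_\Omega(x)^{\sigma_1} \mathcal{A}_\Omega(x^2)^{\sigma_2} \cdots \mathcal{A}_\Omega(x^\ell)^{\sigma_\ell}\Big) \tag{4.3}$$

with $\mathcal{S}_\ell$ denoting the symmetric group of order $\ell$ and $\sigma_i$ denoting the number of cycles of
length i of $\sigma$. In particular, by considering the summand for $\sigma = \mathrm{id}$, we have that

$$\mathcal{A}_\Omega(x) \ge x (\mathcal{A}_\Omega(x))^\ell / \ell!.$$

Since $\ell \ge 2$ this implies that the limit $\lim_{x \uparrow \rho_\Omega} \mathcal{A}_\Omega(x)$ is finite and hence $\mathcal{A}_\Omega(\rho_\Omega)$ is finite.
Moreover, considering the summand in (4.3) for $\sigma$ a cycle of length $\ell$ yields that

$$\infty > \mathcal{A}_\Omega(\rho_\Omega) \ge \rho_\Omega \mathcal{A}_\Omega(\rho_\Omega^\ell) / \ell!.$$

This implies that $\rho_\Omega \le 1$ because otherwise $\mathcal{A}_\Omega(\rho_\Omega^\ell) = \infty$. If $\rho_\Omega = 1$, then (4.3) would imply
that $\mathcal{A}_\Omega(1) \ge 1$. Since every summand in (4.3) is then at least $\mathcal{A}_\Omega(1)$, applying (4.3) again yields

$$\mathcal{A}_\Omega(1) \ge 1 + \mathcal{A}_\Omega(1),$$

which is clearly impossible. Hence our premise cannot hold and thus $\rho_\Omega < 1$. We proceed
with showing ii). We have that $\mathcal{A}_\Omega(z) = E(z, \mathcal{A}_\Omega(z))$. The series $E(z, w)$ is dominated
coefficient-wise by

$$z \exp\Big(w + \sum_{i \ge 2} \mathcal{A}_\Omega(z^i) / i\Big).$$

Since $\rho_\Omega < 1$ it follows that there is an $\epsilon > 0$ such that $E(\rho_\Omega + \epsilon, \mathcal{A}_\Omega(\rho_\Omega) + \epsilon) < \infty$. This
establishes ii). To see the last claim, by a general enumeration result given in [6, Thm. 28] it
follows that

$$[z^m]\mathcal{A}_\Omega(z) \sim \gcd(\Omega) \sqrt{\frac{\rho_\Omega E_z(\rho_\Omega, \mathcal{A}_\Omega(\rho_\Omega))}{2 \pi E_{ww}(\rho_\Omega, \mathcal{A}_\Omega(\rho_\Omega))}} \, \rho_\Omega^{-m} m^{-3/2}, \qquad m \equiv 1 \bmod \gcd(\Omega).$$

□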


4.3. A Boltzmann sampler for random Polya trees. Let $\Omega \subseteq \mathbb{N}_0$ denote a subset
containing 0 and at least one integer $\ge 2$. Recall that we let $\mathcal{A}_\Omega$ denote the combinatorial
class of Polya trees with vertex outdegrees in $\Omega$.

The size-preserving bijection in (4.1) between the classes $\mathcal{A}_\Omega$ and $\mathcal{X} \cdot \mathrm{MSET}_\Omega(\mathcal{A}_\Omega)$, where
each tree corresponds to the multiset of trees dangling from its root, allows us to construct
a Boltzmann sampler $\Gamma\mathcal{A}_\Omega$ for Polya trees. The Boltzmann distribution is a measure on
Polya trees with an arbitrary number of vertices. However, any two trees with n vertices have the
same probability, i.e., the distribution conditioned on the event that the generated tree has
n vertices is uniform. This will allow us to reduce the study of properties of a random Polya
tree with exactly n vertices to the study of $\Gamma\mathcal{A}_\Omega$.


Lemma 4.2. The following recursive procedure $\Gamma\mathcal{A}_\Omega(x)$ terminates almost surely and draws
a random Polya tree with outdegrees in $\Omega$ according to the Boltzmann distribution with
parameter $0 < x \le \rho_\Omega$, i.e. any object with n vertices gets drawn with probability $x^n / \mathcal{A}_\Omega(x)$.

1. Start with a root vertex v. 









2. Let $\sigma(v)$ be a random permutation drawn from the union of permutation groups $\bigcup_{k \in \Omega} \mathcal{S}_k$
with distribution given by

$$\mathbb{P}(\sigma(v) = \nu) = \frac{x}{\mathcal{A}_\Omega(x)} \frac{1}{k!} \, \mathcal{A}_\Omega(x)^{\nu_1} \mathcal{A}_\Omega(x^2)^{\nu_2} \cdots \mathcal{A}_\Omega(x^k)^{\nu_k}$$

for each $k \in \Omega$ and $\nu \in \mathcal{S}_k$. Here $\nu_i$ denotes the number of cycles of length i of the
permutation $\nu$. In particular, $\nu_1$ is the number of fixpoints of $\nu$.

3. If $\sigma(v) \in \mathcal{S}_0$, then return the tree consisting of the root only and stop. Otherwise, for each
cycle $\tau$ of $\sigma(v)$ let $\ell_\tau \ge 1$ denote its length and draw a Polya tree $A_\tau$ by an independent
recursive call to the sampler $\Gamma\mathcal{A}_\Omega(x^{\ell_\tau})$. Make $\ell_\tau$ identical copies of the tree $A_\tau$ and connect
their roots to the vertex v by adding edges. Return the resulting tree and stop.
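
The three steps can be sketched in code for the binary case $\Omega = \{0, 2\}$, where only three cycle types occur: the empty permutation (a leaf), the identity of $\mathcal{S}_2$ (two independent subtrees) and the transposition (one subtree drawn with parameter $x^2$ and duplicated). This is an illustration under stated assumptions and not the authors' implementation: the generating function is evaluated by solving the quadratic equation $A(y) = y(1 + (A(y)^2 + A(y^2))/2)$ recursively, trees are represented as nested tuples of children, and the parameter x = 0.6 is an arbitrary subcritical choice below the radius of convergence.

```python
import math
import random


def A(y):
    """Generating function of Polya trees with outdegrees in {0, 2}, for y
    below the radius of convergence (math.sqrt raises ValueError beyond it)."""
    if y < 1e-9:
        return y                               # A(y) = y + O(y^3)
    disc = 1 - 2 * y * y * (1 + A(y * y) / 2)
    return (1 - math.sqrt(disc)) / y           # smaller root of the quadratic


def sample(x, rng):
    """Boltzmann sampler for outdegrees in {0, 2}; a tree is a tuple of children."""
    a, b = A(x), A(x * x)
    u = rng.random() * (a / x)                 # a / x = 1 + a^2 / 2 + b / 2
    if u < 1:                                  # permutation in S_0: root only
        return ()
    if u < 1 + a * a / 2:                      # identity of S_2: two fixpoints
        return (sample(x, rng), sample(x, rng))
    t = sample(x * x, rng)                     # one cycle of length 2
    return (t, t)                              # two identical copies


def size(tree):
    """Number of vertices of a tree given as nested tuples."""
    return 1 + sum(size(child) for child in tree)


def outdegrees(tree):
    """Set of all outdegrees occurring in the tree."""
    degs = {len(tree)}
    for child in tree:
        degs |= outdegrees(child)
    return degs


if __name__ == "__main__":
    rng = random.Random(7)
    sizes = [size(sample(0.6, rng)) for _ in range(10)]
    print(sizes)
```

Conditioning the output on a fixed size n, for instance by rejection, gives a uniform binary Polya tree with n vertices, since all trees of the same size have the same Boltzmann weight.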


A Boltzmann-sampler $\Gamma\mathcal{A}_\Omega(x)$ is also explicitly described in [8, Fig. 14, (1)]. (Note that
the exposition given there contains a typo, as it corresponds to attaching only one copy of
each tree $A_\tau$ in Step 3.)

We would like to justify the above procedure by applying the rules in Section 3.4 for
obtaining samplers for products and multisets. Indeed, the product rule states that a sampler
$\Gamma\mathcal{A}_\Omega(x)$ may be obtained by taking a root-vertex v (which corresponds to calling $\Gamma\mathcal{X}(x)$),
calling the multiset sampler $\Gamma\mathrm{MSET}_\Omega(\mathcal{A}_\Omega)(x)$, and constructing a tree by connecting v with
the root-vertices of the obtained trees. The rule for multiset classes yields a procedure for
$\Gamma\mathrm{MSET}_\Omega(\mathcal{A}_\Omega)(x)$ that involves calls to $\Gamma\mathcal{A}_\Omega(x^k)$ for several $k \ge 1$. If we interpret these calls
as independent copies of Boltzmann distributed random variables, then the rules stated in
Section 3.4 guarantee that the resulting random Polya tree follows a Boltzmann distribution
with parameter x. However, this procedure is not "explicit", as we do not specify how to
obtain these copies. Hence, instead, we interpret the independent calls $\Gamma\mathcal{A}_\Omega(x^k)$ as recursive
calls to our constructed procedure, i.e. each call corresponds to again taking a root vertex
and choosing (independently) multisets from $\mathrm{MSET}_\Omega(\mathcal{A}_\Omega)$, which again may cause further
recursive calls. This is similar to a branching process. Of course, we need to justify that
this recursive procedure terminates almost surely and samples according to a Boltzmann
distribution with parameter x. This justification is given in n Thm. 4.2] in a more general
context for classes that may be recursively specified as in (4.1) using operations such as
products and multiset classes.


4.4. Deviation Inequalities. We will make use of the following moderate deviation
inequality for one-dimensional random walks found in most textbooks on the subject.

Lemma 4.3. Let $(X_i)_{i \in \mathbb{N}}$ be a family of independent copies of a real-valued random variable X
with $\mathbb{E}[X] = 0$. Let $S_n = X_1 + \ldots + X_n$. Suppose that there is a $\delta > 0$ such that $\mathbb{E}[e^{\theta X}] < \infty$
for $|\theta| < \delta$. Then there is a $c > 0$ such that for every $1/2 < p < 1$ there is a number N such
that for all $n \ge N$ and $0 \le \epsilon \le 1$

$$\mathbb{P}(|S_n / n^p| \ge \epsilon) \le 2 \exp(-c \epsilon^2 n^{2p-1}).$$


5. Proof of the main theorem 

In the following $\Omega$ will always denote a set of nonnegative integers containing zero and
at least one integer greater than or equal to two. Moreover, n will always denote a natural
number that satisfies $n \equiv 1 \bmod \gcd(\Omega)$ and is large enough such that rooted trees with n
vertices and outdegrees in $\Omega$ exist.

Proof of Theorem 1.1. We begin the proof with a couple of auxiliary observations about the
sampler $\Gamma\mathcal{A}_\Omega(x)$ from Lemma 4.2. Let us fix $x = \rho_\Omega$ throughout. We may do so, since by
Proposition 4.1 we have that $0 < \rho_\Omega < 1$ and $\mathcal{A}_\Omega(\rho_\Omega) < \infty$.

Suppose that we modify Step 1 to "Start with a root vertex v. If the argument of the
sampler is $\rho_\Omega$ (as opposed to $\rho_\Omega^i$ for some $i \ge 2$), then mark this vertex with the color blue.".
Then the resulting tree is still Boltzmann-distributed, but comes with a colored subtree which
we denote by $\mathcal{T}$.

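Note that $\mathcal{T}$ is distributed like a Galton-Watson tree without the ordering on the offspring
sets. By construction, the offspring distribution $\xi$ of $\mathcal{T}$ is given by the number of fixpoints of
the random permutation drawn in Step 2. Thus, the probability generating function of $\xi$ is

$$\mathbb{E}[z^\xi] = \frac{\rho_\Omega}{\mathcal{A}_\Omega(\rho_\Omega)} Z_\Omega(z \mathcal{A}_\Omega(\rho_\Omega), \mathcal{A}_\Omega(\rho_\Omega^2), \mathcal{A}_\Omega(\rho_\Omega^3), \ldots). \tag{5.1}$$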

Moreover, for any blue vertex v we may consider the forest F(v) of the trees dangling from v
that correspond to cycles of the permutation $\sigma(v)$ with length at least two. Let $\zeta$ denote
a random variable that is distributed like the number of vertices $|F(v)|$ in F(v). Then the
probability generating function of $\zeta$ is

$$\mathbb{E}[z^\zeta] = \frac{\rho_\Omega}{\mathcal{A}_\Omega(\rho_\Omega)} Z_\Omega(\mathcal{A}_\Omega(\rho_\Omega), \mathcal{A}_\Omega((z \rho_\Omega)^2), \mathcal{A}_\Omega((z \rho_\Omega)^3), \ldots). \tag{5.2}$$

Using Proposition 4.1 it follows that the generating functions $\mathbb{E}[z^\xi]$ and $\mathbb{E}[z^\zeta]$ have radius of
convergence strictly larger than one. Hence $\xi$ and $\zeta$ have finite exponential moments. In
particular, there are constants $c, c' > 0$ such that for any $s \ge 0$

$$\mathbb{P}(\xi \ge s) \le c e^{-c' s} \qquad \text{and} \qquad \mathbb{P}(\zeta \ge s) \le c e^{-c' s}. \tag{5.3}$$

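Moreover, as we argue below, $\xi$ has average value

$$\mathbb{E}[\xi] = \Big(\frac{\partial Z_\Omega}{\partial s_1}\Big)(\mathcal{A}_\Omega(\rho_\Omega), \mathcal{A}_\Omega(\rho_\Omega^2), \ldots) \, \rho_\Omega = 1.$$

This can be shown as follows. Recall that the ordinary generating series satisfies the
identity $\mathcal{A}_\Omega(z) = E(z, \mathcal{A}_\Omega(z))$ with the series $E(z, w)$ given by

$$E(z, w) = z Z_\Omega(w, \mathcal{A}_\Omega(z^2), \mathcal{A}_\Omega(z^3), \ldots).$$

In particular, we have that $F(z, \mathcal{A}_\Omega(z)) = 0$ with $F(z, w) = E(z, w) - w$. Suppose that
$(\frac{\partial}{\partial w} F)(\rho_\Omega, \mathcal{A}_\Omega(\rho_\Omega)) \ne 0$. Then by the implicit function theorem the function $\mathcal{A}_\Omega(z)$ has an
analytic continuation in a neighbourhood of $\rho_\Omega$. But this contradicts Pringsheim's theorem
m Thm. IV.6], which states that the series $\mathcal{A}_\Omega(z)$ must have a singularity at the point $\rho_\Omega$
since all its coefficients are nonnegative real numbers. Hence we have $(\frac{\partial}{\partial w} F)(\rho_\Omega, \mathcal{A}_\Omega(\rho_\Omega)) = 0$
which is equivalent to $\mathbb{E}[\xi] = 1$.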

With all these facts at hand we proceed with the proof of the theorem. Slightly abusing
notation, we let $\mathsf{A}_n$ denote the colored random tree drawn by conditioning the (modified)
sampler $\Gamma\mathcal{A}_\Omega(\rho_\Omega)$ on having exactly n vertices. That is, if we ignore the colors, $\mathsf{A}_n$ is drawn
uniformly among all Polya trees of size n with outdegrees in $\Omega$. Moreover, let $\mathcal{T}_n$ denote the
colored subtree of $\mathsf{A}_n$, and for any vertex v of $\mathcal{T}_n$ let $F_n(v)$ denote the corresponding forest
that consists of non-blue vertices. We will argue that there is a constant $C > 0$ such that with
high probability $|F_n(v)| \le C \log n$ for all $v \in \mathcal{T}_n$. Indeed, note that by Proposition 4.1

$$\mathbb{P}(|\Gamma\mathcal{A}_\Omega(\rho_\Omega)| = n) = \frac{\rho_\Omega^n}{\mathcal{A}_\Omega(\rho_\Omega)} [z^n]\mathcal{A}_\Omega(z) = \Theta(n^{-3/2}), \tag{5.4}$$

i.e. the probability is (only) polynomially small. Thus, for any $s \ge 0$, if we denote by $\zeta_1, \zeta_2, \ldots$
independent random variables that are distributed like $\zeta$

$$\mathbb{P}(\exists v \in \mathcal{T}_n : |F_n(v)| \ge s) = \mathbb{P}(\exists v \in \mathcal{T} : |F(v)| \ge s \mid |\Gamma\mathcal{A}_\Omega(\rho_\Omega)| = n) \le O(n^{3/2}) \, \mathbb{P}(\exists 1 \le i \le n : \zeta_i \ge s).$$

Using (5.3) and setting $s = C \log n$ we get that $\mathbb{P}(\zeta_i \ge s) = o(n^{-5/2})$ for an appropriate choice
of $C > 0$. Thus, by the union bound

$$\mathbb{P}(\forall v \in \mathcal{T}_n : |F_n(v)| \le C \log n) = 1 - o(1). \tag{5.5}$$


The typical shape of $\mathsf{A}_n$ thus consists of a colored tree with small forests attached to each of
its vertices, compare with Figure 1. In particular, we have that the Gromov-Hausdorff distance
between the rescaled trees $\mathsf{A}_n / \sqrt{n}$ and $\mathcal{T}_n / \sqrt{n}$ converges in probability to zero. We are going
to show that there is a constant $c_\Omega > 0$ such that $c_\Omega \mathcal{T}_n / \sqrt{n}$ converges weakly towards the
Brownian continuum random tree $\mathcal{T}_e$. This immediately implies that

$$c_\Omega \mathsf{A}_n / \sqrt{n} \xrightarrow{(d)} \mathcal{T}_e$$

and we are done.

Figure 1. The typical shape of the random Polya tree with n vertices.


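We are going to argue that the number of vertices in $\mathcal{T}_n$ concentrates around a constant
multiple of n. More precisely, we are going to show that for any exponent $0 < s < 1/2$ we
have with high probability that

$$|\mathcal{T}_n| \in (1 \pm n^{-s}) \frac{n}{1 + \mathbb{E}[\zeta]}. \tag{5.6}$$

To this end, consider the corresponding complementary event in the unconditioned setting

$$|\mathcal{T}| \notin (1 \pm n^{-s}) \frac{|\Gamma\mathcal{A}_\Omega(\rho_\Omega)|}{1 + \mathbb{E}[\zeta]}.$$

If this occurs, then we clearly also have that

$$\sum_{v \in \mathcal{T}} (1 + |F(v)|) = |\Gamma\mathcal{A}_\Omega(\rho_\Omega)| \notin (1 \pm \Theta(n^{-s}))(1 + \mathbb{E}[\zeta]) |\mathcal{T}|.$$

Let $\mathcal{E}$ denote the corresponding event. From (5.5) we know that with high probability
$|F_n(v)| = O(\log n)$ for all vertices v of $\mathcal{T}_n$. Hence, with high probability, say, $|\mathcal{T}_n| \ge n / \log^2 n$.
Using again (5.4)

$$\mathbb{P}(\mathcal{E} \mid |\Gamma\mathcal{A}_\Omega(\rho_\Omega)| = n) = O(n^{3/2}) \, \mathbb{P}\Big(\frac{n}{\log^2 n} \le |\mathcal{T}| \le n, \, \mathcal{E}\Big) + o(1).$$

By applying the union bound, the latter probability is at most

$$\sum_{n / \log^2 n \le \ell \le n} \mathbb{P}\Big(\sum_{i=1}^{\ell} (1 + \zeta_i) \notin (1 \pm \Theta(n^{-s}))(1 + \mathbb{E}[\zeta]) \ell\Big).$$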
Since the random variable $\zeta$ has finite exponential moments, we may apply the deviation inequality in Lemma 4.3 in order to bound this by $o(1)$. Hence, (5.6) holds with probability tending to 1 as $n$ becomes large. We are now going to prove that

\[
\frac{\sigma \sqrt{1 + \mathbb{E}[\zeta]}}{2\sqrt{n}}\, \mathcal{T}_n \xrightarrow{(d)} \mathcal{T}_{\mathsf{e}} \tag{5.7}
\]


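with $\sigma^2 > 0$ denoting the variance of the random variable $\xi$. This implies that

\[
c_\Omega \mathsf{A}_n/\sqrt{n} \xrightarrow{(d)} \mathcal{T}_{\mathsf{e}} \tag{5.8}
\]

with

\[
c_\Omega = \frac{\sqrt{1 + \mathbb{E}[\zeta]}\;\sigma}{2}
\]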


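and we are done. Note that $\sigma$ and $\mathbb{E}[\zeta]$ may be computed explicitly from the expression of the probability generating functions in (5.1) and (5.2). We obtain that $\sigma^2$ is given by

\[
\sigma^2 = \rho_\Omega \mathcal{A}_\Omega(\rho_\Omega)\, \frac{\partial^2 Z_\Omega}{\partial s_1^2}\bigl(\mathcal{A}_\Omega(\rho_\Omega), \mathcal{A}_\Omega(\rho_\Omega^2), \ldots\bigr)
 + \rho_\Omega\, \frac{\partial Z_\Omega}{\partial s_1}\bigl(\mathcal{A}_\Omega(\rho_\Omega), \mathcal{A}_\Omega(\rho_\Omega^2), \ldots\bigr)
 - \Bigl(\rho_\Omega\, \frac{\partial Z_\Omega}{\partial s_1}\bigl(\mathcal{A}_\Omega(\rho_\Omega), \mathcal{A}_\Omega(\rho_\Omega^2), \ldots\bigr)\Bigr)^2.
\]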


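Note that $\sigma > 0$, as $\xi$ is not constant. Moreover,

\[
\mathbb{E}[\zeta] = \Bigl(\frac{d}{dz}\, \mathbb{E}[z^\zeta]\Bigr)(1)
 = \frac{\rho_\Omega}{\mathcal{A}_\Omega(\rho_\Omega)} \sum_{i \ge 2} \frac{\partial Z_\Omega}{\partial s_i}\bigl(\mathcal{A}_\Omega(\rho_\Omega), \mathcal{A}_\Omega(\rho_\Omega^2), \ldots\bigr)\, i\, \rho_\Omega^i\, \mathcal{A}'_\Omega(\rho_\Omega^i),
\]

where $\mathcal{A}'_\Omega(z) = \frac{d}{dz} \mathcal{A}_\Omega(z)$. Note that this expression is well-defined, since $0 < \rho_\Omega < 1$.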
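Both formulas are instances of the usual rules for reading moments off a probability generating function. As a reminder, for a random variable $X$ with values in $\mathbb{N}_0$ and finite second moment, with generating function $G(z) = \mathbb{E}[z^X]$ (here $X$ and $G$ are generic names used only in this sketch, and derivatives at 1 are understood as left limits if necessary):

```latex
\begin{align*}
G'(1) &= \mathbb{E}[X], \qquad G''(1) = \mathbb{E}[X(X-1)], \\
\mathrm{Var}(X) &= G''(1) + G'(1) - G'(1)^2 .
\end{align*}
```

Applied to the generating function of $\xi$ from (5.1), the three terms of the variance identity presumably correspond to the three terms in the expression for $\sigma^2$; applied to the generating function of $\zeta$ from (5.2), the first identity gives $\mathbb{E}[\zeta]$.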

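In order to show (5.8), let $f : \mathbb{K} \to \mathbb{R}$ denote a bounded, Lipschitz-continuous function defined on the space $\mathbb{K}$ of isometry classes of compact metric spaces. Note that the tree $\mathcal{T}_n$ conditioned on having $\ell$ vertices is distributed like the tree $\mathcal{T}$ conditioned on having $\ell$ vertices. In particular, it is identically distributed to a $\xi$-Galton-Watson tree $\mathcal{T}^\xi$ conditioned on having $\ell$ vertices, which we denote by $\mathcal{T}^\xi_\ell$. Since (5.6) holds with high probability it follows that

\[
\mathbb{E}[f(c_\Omega \mathcal{T}_n/\sqrt{n})] = o(1) + \sum_{\ell \in (1 \pm n^{-s}) \frac{n}{1 + \mathbb{E}[\zeta]}} \mathbb{E}[f(c_\Omega \mathcal{T}^\xi_\ell/\sqrt{n})]\, \mathbb{P}(|\mathcal{T}_n| = \ell).
\]

Let $D(T)$ denote the diameter of $T$, i.e., the number of vertices on a longest path in $T$. Since $f$ was assumed to be Lipschitz-continuous it follows that

\[
\bigl| \mathbb{E}[f(c_\Omega \mathcal{T}^\xi_\ell/\sqrt{n})] - \mathbb{E}[f(\sigma \mathcal{T}^\xi_\ell/(2\sqrt{\ell}))] \bigr| \le a_{n,\ell}\, \mathbb{E}[D(\mathcal{T}^\xi_\ell)/\sqrt{\ell}]
\]

for a sequence $a_{n,\ell}$ with $\sup_\ell (a_{n,\ell}) \to 0$ as $n$ becomes large. Moreover, the average rescaled diameter $\mathbb{E}[D(\mathcal{T}^\xi_\ell)/\sqrt{\ell}]$ converges to a multiple of the expected diameter of the CRT $\mathcal{T}_{\mathsf{e}}$ as $\ell$ tends to infinity, see e.g. [T] . In particular, it is a bounded sequence. Since

\[
\mathbb{E}[f(\sigma \mathcal{T}^\xi_\ell/(2\sqrt{\ell}))] \to \mathbb{E}[f(\mathcal{T}_{\mathsf{e}})]
\]

as $\ell \to \infty$, it follows that

\[
\mathbb{E}[f(c_\Omega \mathcal{T}_n/\sqrt{n})] \to \mathbb{E}[f(\mathcal{T}_{\mathsf{e}})]
\]

as $n$ becomes large. This completes the proof.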
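The Lipschitz estimate used in this proof can be unpacked as follows. The sketch assumes that $f$ has Lipschitz constant $L$ with respect to the Gromov-Hausdorff metric $d_{GH}$ (both names are introduced here for illustration), that $c_\Omega = \sigma\sqrt{1 + \mathbb{E}[\zeta]}/2$, and it uses the standard fact that two rescalings $\lambda X$ and $\mu X$ of one compact metric space $X$ satisfy $d_{GH}(\lambda X, \mu X) \le |\lambda - \mu|\, \mathrm{diam}(X)$.

```latex
% T stands for any fixed tree with ell vertices; D(T) dominates its metric diameter.
\begin{align*}
\Bigl| f\Bigl(\tfrac{c_\Omega}{\sqrt{n}}\, T\Bigr) - f\Bigl(\tfrac{\sigma}{2\sqrt{\ell}}\, T\Bigr) \Bigr|
 &\le L \,\Bigl| \tfrac{c_\Omega}{\sqrt{n}} - \tfrac{\sigma}{2\sqrt{\ell}} \Bigr|\, D(T)
  = \frac{L\sigma}{2}\, \Bigl| \sqrt{\tfrac{(1+\mathbb{E}[\zeta])\,\ell}{n}} - 1 \Bigr| \, \frac{D(T)}{\sqrt{\ell}}, \\
\ell \in (1 \pm n^{-s})\,\frac{n}{1+\mathbb{E}[\zeta]}
 &\;\Longrightarrow\;
 \Bigl| \sqrt{\tfrac{(1+\mathbb{E}[\zeta])\,\ell}{n}} - 1 \Bigr| \le n^{-s}.
\end{align*}
```

Under these assumptions one may take $a_{n,\ell} = L\sigma n^{-s}/2$, which tends to zero uniformly in the admissible range of $\ell$.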


□ 


Proof of Theorem \l.S\ We are going to use the notation of the previous proof. Let x > 0 be 
given. Without loss of generality, we may assume throughout that $x \ge \sqrt{n}$. If the height $H(\mathsf{A}_n)$ of the tree $\mathsf{A}_n$ satisfies $H(\mathsf{A}_n) \ge x$, then $H(\mathcal{T}_n) \ge x/2$ or $|F_n(v)| \ge x/2$ for at least one vertex $v \in \mathcal{T}_n$. We are going to bound the probability for each of these events separately. By the tail bounds for conditioned Galton-Watson trees in Theorem 2.2 there exist constants $C_1, c_1 > 0$ such that for all $\ell$ and $y \ge 0$ we have that

\[
\mathbb{P}(H(\mathcal{T}) \ge y \mid |\mathcal{T}| = \ell) \le C_1 \exp(-c_1 y^2/\ell).
\]

Moreover, $\mathcal{T}_n$ conditioned on having size $\ell$ is distributed like $\mathcal{T}$ conditioned on having size $\ell$. Hence

\[
\mathbb{P}(H(\mathcal{T}_n) \ge x/2) = \sum_{\ell=1}^{n} \mathbb{P}(|\mathcal{T}_n| = \ell)\, \mathbb{P}(H(\mathcal{T}) \ge x/2 \mid |\mathcal{T}| = \ell) \le C_1 \exp(-c_1 x^2/(4n)). \tag{5.9}
\]

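By (5.4) it holds that

\[
\mathbb{P}\Bigl(\max_{v \in \mathcal{T}_n} |F_n(v)| \ge x/2\Bigr) \le O(n^{3/2})\, \mathbb{P}\Bigl(\max_{v \in \mathcal{T}} |F(v)| \ge x/2,\ |\Gamma\mathcal{A}_\Omega(\rho_\Omega)| = n\Bigr) \le O(n^{5/2})\, \mathbb{P}(\zeta \ge x/2).
\]

As we assumed that $x \ge \sqrt{n}$, it follows by the exponential tail bound (5.3) that there are constants $c, c', d' > 0$ with

\[
n^{5/2}\, \mathbb{P}(\zeta \ge x/2) \le c \exp((5/2) \log n - c' x/2) \le c \exp(-d' x).
\]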

Consequently, there are constants $C_2, c_2 > 0$ with

\[
\mathbb{P}\Bigl(\max_{v \in \mathcal{T}_n} |F_n(v)| \ge x/2\Bigr) \le C_2 \exp(-c_2 x^2/n). \tag{5.10}
\]

Combining Inequalities (5.9) and (5.10) yields

\[
\mathbb{P}(H(\mathsf{A}_n) \ge x) \le C_1 \exp(-c_1 x^2/(4n)) + C_2 \exp(-c_2 x^2/n) \le C_3 \exp(-c_3 x^2/n)
\]

for some constants $C_3, c_3 > 0$. This concludes the proof of the tail-bound for the height.
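The passage from a bound that is exponential in $x$ to the sub-Gaussian form in (5.10) rests on two elementary facts that are easy to overlook. The following sketch assumes $\sqrt{n} \le x \le n$ (for $x > n$ the probability in question vanishes, since the tree has only $n$ vertices), reuses the constant names $c, c', d'$ from the estimate above, and introduces an auxiliary constant $K$ for illustration.

```latex
\begin{align*}
\tfrac{5}{2}\log n &\le 5 \log x \le \tfrac{c'}{4}\, x + K
  && \text{for some } K = K(c') \text{, using } n \le x^2, \\
c \exp\bigl(\tfrac{5}{2}\log n - \tfrac{c'}{2}\, x\bigr) &\le c\, e^{K} \exp\bigl(-\tfrac{c'}{4}\, x\bigr)
  && \text{so one may take } d' = c'/4 \text{ after enlarging the prefactor}, \\
\exp(-d' x) &\le \exp\bigl(-d' x^2/n\bigr)
  && \text{since } x \le n \text{ implies } x \ge x^2/n.
\end{align*}
```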

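In order to show the tail-bound for the width we begin with some auxiliary observations. Note that (5.3) guarantees that there are $c \in (0,1)$ and $c' > 0$ such that for any $t, t' \in \mathbb{N}$

\[
\mathbb{P}(\zeta\, \mathbb{1}(\zeta \ge t) \ge t') \le c' c^{t'} \quad \text{and} \quad \mathbb{E}[\zeta\, \mathbb{1}(\zeta \ge t)] \le c'\, t\, c^{t},
\]

where $\mathbb{1}(\mathcal{E})$ denotes the indicator function for the event $\mathcal{E}$. Let $\zeta_1, \zeta_2, \ldots$ be independent random variables with the same distribution as $\zeta$ and define

\[
\bar\zeta = \sum_{i \ge 1} \zeta_i\, \mathbb{1}(\zeta_i \ge i). \tag{5.11}
\]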


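The previous observation implies that there is a $C > 0$ such that $\mathbb{E}[\bar\zeta] \le C$ and $\bar\zeta$ is finite almost surely. Let $p_i = \mathbb{P}(\zeta < i)$ and for a series $F(z)$ write $F^{\ge i}(z) = \sum_{j \ge i} [u^j]F(u)\, z^j$. Setting $F(z) = \mathbb{E}[z^\zeta]$, from (5.2) we get that

\[
\mathbb{E}[z^{\bar\zeta}] = \prod_{i \ge 1} \bigl(p_i + F^{\ge i}(z)\bigr) = \prod_{i \ge 1} \bigl(1 + (p_i - 1) + F^{\ge i}(z)\bigr).
\]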
Note that for $z \ge 1$ we have that $(p_i - 1) + F^{\ge i}(z) \ge 0$. Our assumption asserts that there is a $\rho > 1$ such that the radius of convergence of all $F^{\ge i}$'s equals $\rho$. Then, for any $z_0 \in (1, \rho)$ we have for some $A > 0$ and $a \in (0,1)$ that $F^{\ge i}(z_0) \le A a^i$ for all $i \in \mathbb{N}_0$. Thus

\[
\sum_{i \ge 1} \bigl| (p_i - 1) + F^{\ge i}(z_0) \bigr| \le \mathbb{E}[\zeta] + \sum_{i \ge 1} A a^i < \infty.
\]


Since $\mathbb{E}[z^{\bar\zeta}]$ has only non-negative coefficients, it follows that the radius of convergence of $\mathbb{E}[z^{\bar\zeta}]$ is larger than one, and thus $\bar\zeta$ has exponential moments as well. We thus infer that we even may choose $c \in (0,1)$ and $c' > 0$ such that for any $t' \in \mathbb{N}$

\[
\mathbb{P}(\bar\zeta \ge t') \le c' c^{t'}. \tag{5.12}
\]
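The step from the summability estimate to the exponential tail (5.12) can be spelled out as follows. This is a sketch that writes $\bar\zeta$ for the random variable defined in (5.11), $F^{\ge i}$ for the series $F$ with all terms of degree below $i$ removed, and $b_i = (p_i - 1) + F^{\ge i}(z_0)$ for the summands above (the name $b_i$ is shorthand introduced here). It uses only $1 + y \le e^{y}$ and Markov's inequality.

```latex
\begin{align*}
\mathbb{E}\bigl[z_0^{\bar\zeta}\bigr] &= \prod_{i \ge 1} (1 + b_i)
  \le \exp\Bigl(\sum_{i \ge 1} b_i\Bigr) < \infty
  && (b_i \ge 0 \text{ since } z_0 > 1), \\
\mathbb{P}(\bar\zeta \ge t') &\le z_0^{-t'}\, \mathbb{E}\bigl[z_0^{\bar\zeta}\bigr]
  && \text{(Markov's inequality applied to } z_0^{\bar\zeta}\text{)}.
\end{align*}
```

Under these assumptions, (5.12) holds with $c = 1/z_0$ and $c' = \mathbb{E}[z_0^{\bar\zeta}]$.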


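With all these facts at hand we proceed with the proof for the tail bound of the width. With foresight, set $a = 1 + 2\mathbb{E}[\bar\zeta] > 0$. First, note that, by the tail bounds for conditioned Galton-Watson trees and since $\mathcal{T}_n$ is a conditioned $\xi$-Galton-Watson tree with at most $n$ vertices,

\[
\mathbb{P}(W(\mathsf{A}_n) \ge x) \le \mathbb{P}(W(\mathsf{A}_n) \ge x \text{ and } W(\mathcal{T}_n) \le x/a) + C_4 \exp(-c_4 x^2/n) \tag{5.13}
\]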


for some $C_4, c_4 > 0$; here and in the sequel we identify $\mathsf{A}_n$ with $\Gamma\mathcal{A}_\Omega(\rho_\Omega)$ conditional on $|\Gamma\mathcal{A}_\Omega(\rho_\Omega)| = n$, and $\mathcal{T}_n$ is its colored subtree. Let $L_d$ be the set of vertices in $\mathcal{T}_n$ with distance $d$ from the root, and define $W_d = |L_d|$. Conditional on "$W(\mathcal{T}_n) \le x/a$" we have that $W_i \le x/a$ for all $1 \le i \le n$ (and, of course, $W_i = 0$ for all other $i$). Now note that the number of vertices at distance $d$ from the root of $\mathsf{A}_n$ is bounded from above by

\[
U_d := W_d + \sum_{i=0}^{d-1} \sum_{v \in L_i} |F_n(v)|\, \mathbb{1}(|F_n(v)| \ge d - i),
\]

since a forest $F_n(v)$ with $v \in L_i$ cannot contribute to the set of depth $d$ vertices in $\mathsf{A}_n$ unless it has at least $d - i$ vertices. Thus


\[
\mathbb{P}(W(\mathsf{A}_n) \ge x \text{ and } W(\mathcal{T}_n) \le x/a) \le \mathbb{P}(\exists\, 1 \le d \le n : U_d \ge x \text{ and } W(\mathcal{T}_n) \le x/a).
\]


Using the union bound, (5.4) and the definition in (5.11), we infer that there is a $C_5 > 0$ such that the latter probability is bounded by

\[
C_5\, n^{5/2}\, \mathbb{P}\Bigl(\sum_{i=1}^{x/a} \bar\zeta_i \ge (1 - 1/a)\, x\Bigr),
\]

where $\bar\zeta_1, \bar\zeta_2, \ldots$ are iid with the same distribution as $\bar\zeta$.


Note that with the choice of $a$ we have that the expectation of the sum equals $(1 - 1/a)x/2$. Since $\bar\zeta$ has exponential moments, see (5.12), by applying Lemma 4.3 we infer that there are constants $c_6, C_6 > 0$ such that

\[
\mathbb{P}(\exists\, 1 \le d \le n : U_d \ge x \text{ and } W(\mathcal{T}_n) \le x/a) \le C_6\, e^{-c_6 x}
\]

and the proof is completed with (5.13).
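The role of the constant $a$ can be checked by a short computation. The sketch below assumes, as in the argument above, that the sum has $x/a$ summands (ignoring rounding) and writes $\bar\zeta$ for the random variable defined in (5.11). It solves for the value of $a$ that makes the mean of the sum equal to half of the threshold $(1 - 1/a)x$.

```latex
\begin{align*}
\mathbb{E}\Bigl[\sum_{i=1}^{x/a} \bar\zeta_i\Bigr] = \frac{x}{a}\, \mathbb{E}[\bar\zeta]
  = \frac{1}{2}\Bigl(1 - \frac{1}{a}\Bigr) x
\iff 2\,\mathbb{E}[\bar\zeta] = a - 1
\iff a = 1 + 2\,\mathbb{E}[\bar\zeta].
\end{align*}
```

With this value the event in question asks the sum to exceed twice its mean, a deviation of order $x$ over a number of summands of order $x$, which is the regime in which one expects a bound of the form $C e^{-cx}$ for variables with exponential moments. The polynomial factor $n^{5/2} \le x^5$ is absorbed because $x \ge \sqrt{n}$, and $x \le n$ gives $e^{-cx} \le e^{-c x^2/n}$.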


□ 